
Overview

With their business value proven and more choices available, the use of data warehouse (DW) appliances is surging. Because of their potential value, organizations must investigate DW appliances as an alternative or addition to a custom DW environment.
- DW appliances deliver easy-to-install, easy-to-manage DW platforms with predictable performance and single point-of-contact service.
- DW appliances are used to support all types of DW, from application-specific data marts to the enterprise DW, with complex mixed workloads.
- DW appliances are available from all the incumbent database management system (DBMS) vendors, as well as many vendors with appliance-specific offerings.
- For DW platforms, DW appliances have become a competing deployment standard to traditional self-configured hardware and storage deployments.
- If you are starting to implement a new DW strategy, you must consider appliances to reduce design and configuration errors, reduce installation and setup time, achieve predictable performance and lower the total cost of ownership (TCO).
- If you already have a DW, the DW appliance can enhance service-level agreement (SLA) compliance for a specific user group or application, while allowing the DW to return to its original performance, without upgrades.
- Although a DW appliance may fix performance issues with an existing DW, do not assume it will be a panacea without first identifying the root cause of each issue.
- DW appliances are not the right fit for every situation. Make sure that you understand the strengths and limitations of each appliance under consideration, and always run a proof of concept.


What You Need to Know

DW appliances are available today from most of the vendors with a DW DBMS, as well as many vendors with only a specific DW appliance. They are being used for all types of data warehousing, from application-specific data marts to enterprise DW implementations with complex mixed workloads (see Note 1). Over the past two years, interest in and use of DW appliances have been growing fast. As the value proposition of DW appliances grows, more organizations will realize their benefits. By 2015, at least 50% of enterprises with DWs in production will include a DW appliance. This research should be of interest to CIOs, chief technology officers (CTOs), IT architecture teams and database administration teams, as well as anyone involved in the design and acquisition of a DW.

Analysis

We first wrote about DW appliances, and defined them (see Note 2), in 2007. This definition has not changed since, and has shown the flexibility to remain accurate while vendors adapt to new technology and changing market demands. At that time, there were a handful of appliances available from vendors (primarily DATAllegro [since acquired by Microsoft], Greenplum, HP, IBM, Netezza and Teradata). Today, nearly every DW DBMS vendor has either an appliance and a stand-alone DBMS or an appliance only. In less than three years the number has doubled to 12, not including Microsoft, which will introduce the Parallel Data Warehouse appliance later this year. Also, these numbers only include those vendors on the latest DW DBMS Magic Quadrant; there are other new entrants with products available today (such as Kickfire).
Appliances have actually been around for many years. Teradata released its first appliance in 1983 (the DBC/1012 Model 1). Although it was not called an "appliance" (the term "data warehouse" was not even known), it still met our definition. Appliances have also been available in the IT industry for many years in other markets (security appliances, for example). Netezza was the first to use the term "data warehouse appliance," when it came to market in late 2002 with its NPS 8100 DW appliance.
Obviously, the fact that vendors have the products does not imply that customers will buy them. Marketing aside, customers find value in acquiring a complete appliance as the solution for the enterprise data warehouse (EDW), or as a component of the overall DW infrastructure. Vendors have also seen this value and this, coupled with customer demand, has encouraged them to come to market with many offerings that, by our definition (see Note 2), are considered DW appliances. In this research we will discuss the definition of a DW appliance, its variations and the value proposition for its use, and dispel many of the myths surrounding DW appliances.
Finally, it should be of interest that Gartner has seen a major increase in the volume of inquiries on the subject of DW appliances. These include: (1) What is a DW appliance?; (2) What is the value proposition for a DW appliance?; (3) Who are the leading vendors with DW appliances?; (4) What are the cost advantages of the DW appliance?; (5) Where does the DW appliance fit into an overall DW architecture? and (6) Why should I buy a DW appliance? In 2008, we received a few inquiries per month on the topic; today we receive as many as five or more per week.
Today, many vendors offer different options with their appliances. Some offer more than one operating system (Unix, Linux or Windows); some offer several options in terms of the size and type of the storage drive, without altering the number of drives (such as SAS drives at 140 gigabytes [GB] or eSata drives at 2 terabytes [TB], per drive); and some offer completely different hardware stacks (such as availability on Dell, HP, IBM and Sun Microsystems), maintaining the proper balance in each vendor configuration. We are now beginning to see several vendors offering solid-state disks (SSDs) in their appliances. As long as these variations maintain the proper balance of hardware, storage and software, they meet the requirements of our definition.

The Data Warehouse Value Proposition
There are many benefits to using a DW appliance over a custom-built environment, each contributing to the overall value of using a DW appliance:
- Packaged, balanced configurations based on the size of the database. This removes the necessity of configuring the infrastructure (hardware, memory and storage) for a custom solution, which not only requires time and resources but is also subject to expensive mistakes. Very often an error in the configuration can cause the organization to replace expensive hardware components prematurely, or at least to spend extra resources optimizing the current configuration to increase performance. Even then, it may not be possible to attain the service level that the SLA requires.
- Simplified and accelerated installation and setup. This is completely a question of resources. In a custom environment this can take weeks or even months. In the case of a DW appliance this is normally minimal.
- Ease of maintenance and support through a single source. In many instances, organizations can spend days identifying the source of problems, increasing support costs. With appliances there is a single point of support, and a single call for support is all that is necessary. Furthermore, maintenance upgrades to the software are tested in a controlled environment by the vendor, reducing the possibility of issues with changes to or new versions of the DBMS.
- Simple, integrated management of the system as a single entity. Many vendors, especially those with a specifically built appliance (for example Greenplum, Netezza and Teradata), have specialized management software to manage the DW environment. Others have enhanced existing management tools to support the DW appliance (IBM and Oracle, for example). This increases the efficiency of the resources managing the DW and reduces resource cost.
- Lower number of resources required to manage the system. In almost every case, reports from Gartner clients during inquiries show that fewer resources are required to manage DW appliances. Initial research into appliances and their staffing advantages in 2006 demonstrated a full-time-equivalent (FTE) ratio of nearly 4 to 1 to manage a custom-built DW compared with an appliance. More recent inquiries indicate that the ratio is now lower (approximating 2.6 to 1). Although the ratio is lower, appliances maintain an advantage due to the balanced, integrated system. Gartner clients also report that after moving an existing DW to an appliance, the number of necessary indexes drops, in some cases by more than an order of magnitude (in one case over 300 indexes dropped to just three).
- Predictable performance (in some cases with vendor-supplied SLAs). The DW appliance is sold based on the total amount of source system extracted data (SSED) to be loaded into the appliance. The configuration is balanced for performance based on this data size. The result is predictable performance, requiring less optimization and fewer upgrades to the environment to achieve the necessary performance.
- Lower total cost of acquisition (TCA) and/or TCO. Both TCA and TCO are based not only on the purchase cost, but also on the resources needed to configure, install and set up the DW. Furthermore, in a custom environment, many organizations will over-purchase the hardware (also raising the license cost of the software) in an effort to assure performance. TCO will also be affected by the ongoing support of the DW and the resources required to manage the environment. Altogether, the TCA and TCO of the appliance will be lower.
Based on the value of the DW appliance, we believe that appliances should be considered as an alternative to a custom infrastructure for all new DW implementations. They should also be considered: (1) As a replacement for DW infrastructure that is older or outdated and due to be replaced; (2) As an application-specific data mart if performance issues can be traced to this application running on the DW; (3) As a new data mart for a group of users or an application requiring high performance and/or strict adherence to SLAs and (4) As a general replacement for the DW if the lower TCO of the appliance shows an acceptable return on investment to the organization.
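The cost components named in the value proposition can be combined into a rough comparison. The following Python sketch is illustrative only: the class name `DeploymentCosts`, its fields and every figure in the usage note are hypothetical placeholders rather than Gartner data, and the split of costs between TCA and TCO is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class DeploymentCosts:
    """Cost inputs for one DW deployment option (all values hypothetical)."""
    purchase: float        # purchase price of hardware and software
    setup_labor: float     # one-time cost to configure, install and set up
    annual_support: float  # yearly vendor support and maintenance
    admin_ftes: float      # full-time equivalents managing the system
    cost_per_fte: float    # assumed fully loaded yearly cost of one FTE

    def tca(self) -> float:
        """Total cost of acquisition: purchase plus one-time setup effort."""
        return self.purchase + self.setup_labor

    def tco(self, years: int) -> float:
        """Total cost of ownership over a planning horizon in years."""
        annual = self.annual_support + self.admin_ftes * self.cost_per_fte
        return self.tca() + years * annual


def tco_savings(custom: DeploymentCosts, appliance: DeploymentCosts, years: int) -> float:
    """Positive result means the appliance is cheaper over the horizon."""
    return custom.tco(years) - appliance.tco(years)
```

As a usage example, `DeploymentCosts(1_000_000, 200_000, 150_000, 2.6, 120_000)` for a custom build and `DeploymentCosts(1_000_000, 20_000, 150_000, 1.0, 120_000)` for an appliance apply the 2.6-to-1 staffing ratio cited above to otherwise invented figures. Real inputs should come from vendor proposals and the organization's own staffing costs.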

Myths About Data Warehouse Appliances
A DW appliance is a DW in a box.
We continue to hear this in client inquiries and continue to be surprised by it. A DW is much more than a DW appliance or a DBMS running on any hardware configuration. When a DW appliance is installed, the DBMS is running on the appliance, but the appliance does not contain your DW model, schema or data. Some vendors do offer logical data models to go along with an appliance. These are great starting points, but seldom will they address all the needs of the EDW for any organization. Furthermore, the data is not part of the model and will still need to be loaded into the DW. Even when the DW appliance is sold and delivered with data integration products installed, these products need to be set up to use your schema and data. The DW appliance will certainly simplify the system's architectural decisions in creating a DW and installing the hardware and software. However, the task of creating the DW model (or modifying a logical data model, creating the data integration templates and loading the data) is still left to the customer. There is no replacement for creating a good model and populating it with good quality data.
There is one exception to this. If the organization has an existing DW running on a custom hardware configuration and the desire is to move it to a DW appliance with the same DBMS, the move will certainly require far fewer resources after installation of the DW appliance. Other benefits would be more predictable performance and lower TCO. This may also be true even when switching the DBMS, if the design of the DW is optimized without using specific functionality available only in the current DBMS. We have heard of these situations in client interactions, where the DW was moved to a new appliance platform (including a new DBMS) over a short period of time and was running without problems.
This is also true of the DW applications running on the DW platform, as most of the applications use tools (such as business intelligence [BI] and analytics) with standard interfaces, and the underlying DBMS is less relevant. This is another advantage of a proof of concept, in which some or all of the applications have already been tested running in the new environment.
DW appliances are not mature.
First, many of today's DW appliances have been around for years. Even some of the new entrants use components that are standard, especially for the hardware. It is true that some of the appliances are based on new DBMS software and therefore have a greater risk of software problems. If that is a concern, then we would recommend using one of the more mature DBMS engines for the DW appliance. Overall, we believe that there are thousands of DW appliances in production today from a large mix of vendors, proving that they are mature. This is also shown in the current Gartner Hype Cycle, in which DW appliances are approaching the Plateau of Productivity. As to the technology, most of today's DW appliances run on commonly available hardware, implying little or no risk from the hardware. We believe that DW appliances present less risk, in general, than a locally architected environment for data warehousing, primarily because vendors have purposely built these appliances to support a DW environment.
DW certified configurations are just as good.
These are called "validated" or "certified" configurations. The concept allows the client to give the salesperson the size of the database in terabytes, along with several other parameters, which are entered into a program. The result is a configuration (that is, a bill of materials) for a system that meets our definition of an appliance from a hardware and software standpoint. An additional benefit of these certified configurations is their use as a tuning tool for existing DW implementations. However, the certified configuration does not address the single source of service and support. Further, there is nothing preventing the customer from changing any or all of the suggested configuration before purchasing. The concepts of validated configurations or appliance foundations are a step in the right direction toward advising clients on the configuration necessary for performance, but the configuration is still missing the SLA guarantee associated with appliances, as well as the single point of service and support.
DW appliances are more expensive.
In general, after seeing many proposals for DW appliances, we would disagree, and that is based on the price of the appliance alone. When we consider the extra resources needed to design, configure, install and debug a non-appliance infrastructure, the cost of an appliance will clearly be less. And since the appliance is likely to require less optimization of the DBMS for performance, this adds to the reduced costs. Any cost assessment of the DW must compare all of these costs, not just the purchase price. Overall, we believe that the cost of an appliance will be less than that of a custom-built solution.
These benefits, coupled with the increased availability of DW appliances, increase the value proposition of the appliance, driving greater interest in and use of the DW appliance in the DW environment and, in turn, the decision to buy an appliance. Although today we believe that about 10% of production DW infrastructure includes some form of appliance, based on our inquiries with clients we see this number increasing rapidly during the next five years. This is the basis for our Strategic Planning Assumption that "By 2015, at least 50% of enterprises with data warehouses in production will include a data warehouse appliance."
 © 2010 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. Reproduction and distribution of this publication in any form without prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner's research may discuss legal issues related to the information technology business, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner shall have no liability for errors, omissions or inadequacies in the information contained herein or for interpretations thereof. The opinions expressed herein are subject to change without notice.

Strategic Planning Assumption
By 2015, at least 50% of enterprises with data warehouses in production will include a data warehouse appliance.

Note 1
The modern complex mixed workload consists of the following:
- Continuous (near-real-time) data loading similar to an online transaction processing (OLTP) workload (due to the updating of indexes and other optimization structures in the data warehouse) that forces issues in summary and aggregate management to support dashboards and prebuilt reports.
- Batch data loading continues to persist as the market matures and begins to realize that not all data is required for "right time" latency, and that some information, being less volatile, does not need records refreshed as frequently as the more dynamic real-time data elements.
- Large numbers of standard reports, ranging in the thousands per day, requiring Structured Query Language (SQL) tuning, index creation, new types of storage partitioning and other types of optimization structures in the DW.
- Tactical business analytics in which business process professionals with limited query language experience use prebuilt analytic data objects with aggregated data (pre-joins) and designated dimensional drill downs (summary). They rely on a BI architect to develop commonly used cubes or tables.
- An increasing number of true ad hoc query users (data miners) with a random, unpredictable use of the data, implying a lack of ability to specifically tune for these queries.
- The use of analytics and BI-oriented functionality in OLTP applications, creating a highly tactical use of the DW as a source of information for the OLTP applications requiring high-performance queries. This is one force driving the need for high availability in the DW.

Note 2
A DW appliance is a prepackaged or pre-configured balanced set of hardware (servers, memory, storage and input/output [I/O] channels), software (operating system, DBMS and management software), service and support, sold as a unit with built-in redundancy for high availability and positioned as a platform for data warehousing. Furthermore, it must be sold by the amount of source system extracted data (SSED) ("raw data") to be stored in the data warehouse, and not by the configuration (for example, the number of servers or the number of storage spindles). We allow some flexibility with performance criteria to facilitate vendors having several variations based on the desired performance SLAs and the type of workload targeted for the appliance. The primary concern is that the user (buyer) must not be able to change the configuration because of budget issues, since doing so would adversely affect the performance of the appliance.
The most significant points of the definition of a DW appliance are:
- A fixed configuration of hardware with respect to servers, storage and interconnect, balanced to yield a predictable performance.
- A DBMS tuned for data warehousing.
- Built-in redundancy for high availability.
- Sold by a single source (you do not purchase the hardware from one vendor, storage from another and the DBMS from a third).
- A single point of service ("a single throat to choke").
- Configured by the amount of SSED to be loaded into the appliance.
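The six points above can be used as a simple checklist when evaluating a vendor offering against the definition. The Python sketch below is one hypothetical way to encode that checklist; the class name `ApplianceOffering`, its field names and the helper functions are illustrative assumptions, not part of the definition itself.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ApplianceOffering:
    """One flag per point of the definition (field names are hypothetical)."""
    fixed_balanced_configuration: bool  # servers, storage and interconnect fixed and balanced
    dbms_tuned_for_dw: bool             # DBMS tuned for data warehousing
    built_in_redundancy: bool           # redundancy for high availability
    single_source_of_sale: bool         # hardware, storage and DBMS bought from one vendor
    single_point_of_service: bool       # one contact for service and support
    sized_by_ssed: bool                 # configured by amount of SSED, not by component count


def unmet_criteria(offering: ApplianceOffering) -> List[str]:
    """Return the names of the definition points the offering fails."""
    return [f.name for f in fields(offering) if not getattr(offering, f.name)]


def meets_definition(offering: ApplianceOffering) -> bool:
    """An offering qualifies only when every point of the definition holds."""
    return not unmet_criteria(offering)
```

For example, an offering whose configuration the buyer can still alter and which lacks a single point of service would be recorded with `fixed_balanced_configuration=False` and `single_point_of_service=False`; `unmet_criteria` then lists exactly those two points and `meets_definition` returns `False`.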